Outlier detection from multiple data sources
Authors
Abstract
Preparing a dataset by merging multiple data sources with a fusion method may lead to the loss of vital information from each source and of a certain amount of correlative information among the sources. Based on an extensive analysis of “the unique characteristics” of outliers, we propose outlier detection techniques that reliably identify outliers in multi-source datasets. Several real-world examples are considered in order to classify outliers into three types (Type I-III), depending on their correlation. We design a baseline algorithm, which is an intuitive solution, and an optimal algorithm known as multiple-data-sources-oriented outlier detection (MOD) to obtain high-score outliers. In addition, we build MOD+ to speed up the detection process. A new density metric combining kNN and RNN is introduced to evaluate the deviation degrees of outliers and is applied to develop outlier-join operators. MOD is adept at (1) mining outliers from one of the datasets and (2) sensing outliers from these datasets. We implement the algorithms on synthetic datasets; the experimental results demonstrate that the proposed methods are promising in the practical context of detecting outliers.
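The abstract does not give the formula for the density metric, so the following is only an illustrative sketch and not the paper's MOD metric. It assumes that RNN stands for reverse nearest neighbors and combines the two quantities in a hypothetical way: a point's mean distance to its k nearest neighbors is divided by one plus the number of points that count it among their own k nearest neighbors. The function names `knn_indices` and `deviation_scores` and the sample points are made up for illustration.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def deviation_scores(points, k=2):
    """Hypothetical score: mean kNN distance / (1 + reverse-neighbor count).

    This is an assumed combination for illustration only; a higher score
    means the point is more isolated.
    """
    knn = knn_indices(points, k)
    # Reverse-neighbor count of j: how many points list j among their k nearest.
    reverse_count = [0] * len(points)
    for nbrs in knn:
        for j in nbrs:
            reverse_count[j] += 1
    scores = []
    for i, nbrs in enumerate(knn):
        mean_dist = sum(math.dist(points[i], points[j]) for j in nbrs) / k
        scores.append(mean_dist / (1 + reverse_count[i]))
    return scores


# Four tightly clustered points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = deviation_scores(points, k=2)
outlier_index = max(range(len(points)), key=scores.__getitem__)
```

In this toy input the isolated point is far from its nearest neighbors and is nobody's nearest neighbor, so both terms push its score up. A real multi-source setting would additionally need the join across sources that the abstract refers to, which this sketch does not attempt.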
Similar resources
Spatial Outlier Detection from GSM Mobility Data
With the rapid growth of cellular networks, many mobility datasets have become publicly available, which has attracted researchers to study human mobility, a spatio-temporal phenomenon. Mobility profile mining is the main task in spatio-temporal trend analysis, and profiles can be extracted from the location information available in the dataset. The location information is usually gathered through the GPS, ...
Certifying Data from Multiple Sources
Data integrity can be problematic when integrating and organizing information from many sources. In this paper we describe efficient mechanisms that enable a group of data owners to contribute data sets to an untrusted third-party publisher, who then answers users’ queries. Each owner gets a proof from the publisher that his data is properly represented, and each user gets a proof that the answ...
Outlier Detection with Uncertain Data
In recent years, many new techniques have been developed for mining and managing uncertain data. This is because of the new ways of collecting data which has resulted in enormous amounts of inconsistent or missing data. Such data is often remodeled in the form of uncertain data. In this paper, we will examine the problem of outlier detection with uncertain data sets. The outlier detection probl...
Outlier detection for skewed data
Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method which does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel-Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingne...
Outlier Detection in Multivariate Data
The objective of this research is the detection of outliers in multivariate data by employing various distance measures, particularly using a robust regression diagnostic technique. Several classical outlier identification methods are generally based on the sample mean and covariance matrix. But they do not always yield better results, as they are themselves affected by the outliers. Sometimes one outlier...
Journal
Journal title: Information Sciences
Year: 2021
ISSN: 0020-0255, 1872-6291
DOI: https://doi.org/10.1016/j.ins.2021.09.053